Route Distribution Incentives 



Joud Khoury 1 , Chaouki T. Abdallah 1 , Kate Krause 2 , and Jorge Crichigno 1 

1 ECE Department, MSC01 1100, 

1 University of New Mexico, Albuquerque NM 87131 
{jkhoury, chaouki, jcrichigno}@ece.unm.edu

2 Economics Department, University of New Mexico 
1915 Roma NE/Economics Bldg., Albuquerque NM 87131 

kkrause@unm.edu



Abstract. We present an incentive model for route distribution in the 
context of path vector routing protocols and we focus on the Border 
Gateway Protocol (BGP). BGP is the de-facto protocol for interdomain 
routing on the Internet. We model BGP route distribution and compu- 
tation using a game in which a BGP speaker advertises its prefix to its 
direct neighbors promising them a reward for further distributing the 
route deeper into the network, the neighbors do the same thing with 
their neighbors, and so on. The result of this cascaded route distribu- 
tion is an advertised prefix and hence reachability of the BGP speaker. 
We first study the convergence of BGP protocol dynamics to a unique 
outcome tree in the defined game. We then proceed to study the exis- 
tence of equilibria in the full information game considering competition 
dynamics. We focus our work on the simplest two classes of graphs: 1) 
the line (and the tree) graphs which involve no competition, and 2) the 
ring graph which involves competition. 



1 Introduction 

The Border Gateway Protocol (BGP) is a policy-based path vector protocol and 
is the de-facto protocol for Internet interdomain routing. The protocol's specification
[17] was initially intended to empower domains with control over route
selection and route propagation. The commercialization of the Internet transformed
Autonomous Systems (ASes) into economic entities that act selfishly when
implementing their internal policies and particularly the decisions that relate 
to route selection and propagation [3]. BGP is intrinsically about distributing 
route information to destinations (which are IP prefixes) to establish paths in 
the network. Path discovery, or simply discovery hereafter, starting with some 
destination prefix is the outcome of route distribution and route computation. 

Accounting for and sharing the cost of discovery is an interesting problem 
whose absence from current path discovery schemes has led to critical economic 
and scalability concerns. As an example, the BGP control plane functionality is 
oblivious to cost. A node (BGP speaker) that advertises a provider-independent 
prefix (identifier) does not pay for the cost of being discoverable. Such a cost, 
which may be large given that the prefix is maintained at every node in the 



Default Free Zone (DFZ), is paid by the rest of the network. For example, Her- 
rin [10] has preliminarily analyzed the non-trivial cost of maintaining a BGP 
route. This incentive mismatch in the current BGP workings is further exacerbated
by provider-independent addressing, multi-homing, and traffic engineering
practices [15]. The number of BGP prefixes in the global routing
table (or RIB) is constantly increasing at a rate of roughly 100,000 entries
every 2 years and is expected to reach a total of 388,000 entries in 2011 [11].
This growth has motivated us to devise a model that accounts for distribution incentives in
BGP.

A large body of work has focused on choosing the right incentives given that 
ASes are self-interested, utility-maximizing agents. While exploring incentives, 
most previous work has ignored the control plane incentives³ (route advertisement/distribution)
and has instead focused on the forwarding plane incentives
(e.g. transit costs). One possible explanation for this situation is based on the 
following assumption: a node has an incentive to distribute routes to destinations 
since the node will get paid for transiting traffic to these destinations, and hence 
route distribution is ignored as it becomes an artifact of the transit process. We 
argue that this assumption is not economically viable by considering the arrival 
of a new customer (BGP speaker). While the servicing edge provider makes 
money from transiting the new customer's traffic to the customer, the middle 
providers do not necessarily make money while still incurring the cost to main- 
tain and distribute the customer's route information. In this work, we separate 
the control plane incentives (incentives to distribute route information) from the 
forwarding plane incentives (incentives to forward packets) and use game theory 
to model a BGP distribution game. The main problem we are interested in is
how to allow BGP prefix information to be distributed globally while aligning
the incentives of all the participating agents.

A Simple Distribution Model We synthesize many of the ideas and results
from [9,12,16,15] into a coherent model for studying BGP route distribution incentives.
Influenced by the social network query propagation model of Kleinberg
and Raghavan [12], we use a completely distributed model in the sense that it
does not assume a central bank (in contrast to previous work on truthful mechanisms
[16]). A destination d advertises its prefix and wishes to invest some initial
amount of money $r_d$ in order to be globally discoverable (or so that the information
about d be globally distributed). Since d may distribute its information to
its direct neighbors only, d needs to provide incentives to get the information to 
propagate deeper into the network. Therefore, d must incentivize its neighbors 
to be distributors of its route, who then incentivize their neighbors to be dis- 
tributors, and so on. While we take BGP as the motivating application, we are 
interested in the general setting of distributing a good to a set of agents. Agents 
are located on a network and trade may only occur between directly connected 



3 In this paper, we use the term "control plane" to refer only to route prefix advertisements
(not route updates) as we assume that the network structure is static.



agents. Prices are chosen strategically and the agents are rewarded by volume 
of sales. 

Our Results A general model for studying BGP was originally defined by Griffin
et al. in [5] and later by Levin et al. in [13]. In Section 2, we build upon this
general model to define the BGP distribution game. The main goal of this
paper is to study the existence of equilibria in the defined game. Studying the
equilibria for arbitrary graph structures is not an easy problem given the com- 
plexity of the strategic dependencies and the competition dynamics. As we are 
not aware of general existence results that apply to our game, we initially focus 
on the simplest two classes of graphs: 1) the line (and the tree) graphs which 
involve no competition, and 2) the ring graph which involves competition. We 
assume full information as we are interested in studying the existence question 
initially rather than how the players would arrive at the equilibrium. 
We show that a subgame perfect equilibrium always exists for the game induced 
on the line graph (and on the tree), while no such equilibrium exists for the 
game induced on the ring graph due to oscillation of best-response dynamics 
under competition. While the full game does not have a subgame perfect equi- 
librium, we show that there always exists a Nash equilibrium for a special class 
of subgames. This requires us to first quantify the growth of rewards, or in other
words the minimum incentive $r_d^*$ such that there exists an equilibrium outcome
which is a spanning tree (i.e. d is globally discoverable).

Related work The Simple Path Vector Protocol (SPVP) formalism [9] develops 
sufficient conditions for the outcome of a path vector protocol to be stable. A 
corresponding game-theoretic model was developed by Levin et al. [13] that captures these
conditions in addition to incentives. Feigenbaum et
al. study incentive issues in BGP by considering least cost path (LCP) policies [5]
and more general policies [6]. Our model is fundamentally different from [5] (and
other works based on mechanism design [18]) in that the prices are strategic,
the incentive structure is different, and we do not assume the existence of a
central "designer" (or bank) that allocates payments to the players; the model is instead
completely distributed as in real markets. The bank assumption is limiting in a
distributed setting, and an important question posed in [6] is whether the bank 
can be eliminated and replaced by direct payments by the nodes. A desirable 
property of our model is that payments are bilateral and may only flow between 
neighbors where a player i should not be able to send a payment to another 
player j unless the latter is a direct neighbor. This renders the model more 
robust to manipulation. 

Li et al. [14] study an incentive model for query relaying in peer-to-peer (p2p)
networks based on rewards, upon which Kleinberg and Raghavan [12] build to model a
more general class of trees. In [12], Kleinberg and Raghavan allude to a similar
version of our distribution game in the context of query incentive networks.
They pose the general question of whether an equilibrium exists for general
Directed Acyclic Graphs (DAGs) in the query propagation game. Neither of these
probabilistic models accounts for competition. While we borrow the basic



idea, we address a different problem which is that of route distribution versus
information seeking. 

Finally, our work relates to price determination in network markets with 
intermediaries (refer to the work by Blume et al. [5] and the references therein). 
A main differentiator of this class of work from other work on market pricing 
is its consideration of intermediaries and the emergence of prices as a result of 
strategic behavior rather than competitive analysis or truthful mechanisms. Our 
work specifically involves the cascading of traders (or distributors) on complex 
network structures. 

2 The General Game 

Reusing notation from [6,13], we consider a graph $G = (V, E)$ where $V$ is a set
of $n$ nodes (alternatively termed players, or agents), each identified by a unique
index $i \in \{1, \ldots, n\}$, and a destination $d$, and $E$ is the set of edges or links.
Without loss of generality (WLOG), we study the BGP discovery/route distribution
problem for some fixed destination AS with prefix $d$ (as in [9,6,13]).
The model is extendable to all possible destinations (BGP speakers) by noticing
that route distribution and computation are performed independently per
prefix. The destination $d$ is referred to as the advertiser and the set of players
in the network are termed seekers. Seekers may be distributors who participate
in distributing $d$'s route information to other seeker nodes or consumers who
simply consume the route (leaf nodes in the outcome distribution tree). For each
seeker node $j$, let $P(j)$ be the set of all routes to $d$ that are known to $j$ through
advertisements, $P(j) \subseteq \mathcal{P}(j)$, the latter being the set of all simple routes from
$j$. The empty route $\phi \in \mathcal{P}(j)$. Denote by $R_j \in P(j)$ a simple route from $j$ to
the destination $d$ with $R_j = \phi$ when no route exists at $j$, and let $(k,j)R_j$ be the
route formed by concatenating link $(k,j)$ with $R_j$, where $(k,j) \in E$. Denote by
$B(i)$ the set of direct neighbors of node $i$ and let $next(R_i)$ be the next hop node
on the route $R_i$ from $i$ to $d$. Define node $j$ to be an upstream node relative to
node $i$ when $j \in R_i$. The opposite holds for a downstream node. Finally, we use
$r_{next(R_i)}$ to refer to the reward that the upstream parent from $i$ on $R_i$ offers to
$i$.

The general distribution game is as follows: destination $d$ first exports its
prefix (identifier) information to its neighbors promising them a reward $r_d \in \mathbb{Z}^+$
which directly depends on $d$'s utility of being discoverable. A node $j$ (a player)
in turn receives offers from its neighbors where each neighbor $i$'s offer takes the
form of a reward $r_{ij}$. A reward $r_{ij}$ that a node $i$ offers to some direct neighbor
$j \in B(i)$ is a contract stating that $i$ will pay $j$ an amount that is a function of
$r_{ij}$ and of the set of downstream nodes $k$ that decide to route to $d$ through $j$
(i.e. $j \in R_k$ and $R_j = (j,i)R_i$). After receiving the offers, player $j$ strategizes by
selecting a route among the possibly multiple advertised routes to $d$, say $(j,i)R_i$,
and deciding on a reward $r_{jl} < r_{ij}$ to send to each candidate neighbor $l \in B(j)$
that it has not received a competing offer from. Note then that $r_{lj} < r_{jl}$ where
$r_{lj} = 0$ means that $j$ did not receive an offer from neighbor $l$. Node $j$ then
pockets the difference $r_{ij} - r_{jl}$. The process repeats up to some depth that is
directly dependent on the initial investment as well as on the strategies of the 
players. We intentionally keep this reward model abstract at this point, but will 
revisit it later in the discussion when we define more specific utility functions. 
Clearly in this model, we assume that a player can strategize per neighbor, 
presenting different rewards to different neighbors. This assumption is based on 
the autonomous nature of the nodes and the current practice in BGP where 
policies may differ significantly across neighbors (as with the widely accepted 
Gao-Rexford policies [5] for example).

Assumptions To keep our model tractable, we take several simplifying assump- 
tions. In particular, we assume that: 

1. the graph is at steady state for the duration of the game i.e. we do not 
consider topology dynamics; 

2. the advertiser d does not differentiate among the different players (ASes) in
the network i.e. the ASes are indistinguishable to d;

3. the advertised rewards are integers and are strictly decreasing with depth i.e.
$r_{ij} \in \mathbb{Z}^+$ and $r_{ij} < r_{next(R_i)}, \forall i,j$. We let 1 unit be the cost of distribution
(a similar assumption was taken in [12] to avoid the degenerate case of never
running out of rewards, referred to as "Zeno's Paradox");

4. a node that does not participate will have a utility of zero; 

5. finally, our choice of the utility function isolates a class of policies which we
refer to as the Highest Reward Path (HRP). As the name suggests, HRP
policies incentivize players to choose the path that promises the highest
reward. Such a class of policies may be defined more generally to account for
more complex cost structures as part of the decision space⁴. We assume for
the scope of this work that transit costs are extraneous to the model. This
is a restrictive assumption given that BGP allows for arbitrary and complex
policies that are generally modeled with a valuation or preference function
over the different routes to d (see [9,6]).

Strategy Space: Given a set of advertised routes $P(i)$ where each route $R_i \in P(i)$
is associated with a promised reward $r_{next(R_i)} \in \mathbb{Z}^+$, a pure strategy $s_i \in S_i$ of
an autonomous node $i$ comprises two decisions:

- After receiving offers from neighboring nodes, pick a single "best" route
$R_i \in P(i)$ (where "best" is defined shortly in Theorem 1);

- Pick a reward vector $r_i = [r_{ij}]_j$ promising a reward $r_{ij}$ to each candidate
neighbor $j$ (and export route and reward to respective candidate neighbors).



4 Metric-based policies could be modeled with HRP by fixing one of the players'
decisions. For example, fixing $r_{ij} = r_{next(R_i)} - 1, \forall i,j$ results in the hop count metric; or
alternatively setting $r_{ij} = r_{next(R_i)} - c_i$, where $c_i$ is some local cost to the node, results
in the Least Cost Path (LCP) policy [5], etc.



A strategy profile $s = (s_1, \ldots, s_n)$ and a reward $r_d$ define an outcome of the
game⁵. Every outcome determines a set of paths to destination $d$ given by
$O_d = (R_1, \ldots, R_n)$. A utility function $u_i(s)$ for player $i$ associates every outcome
with a real value in $\mathbb{R}$. We use the notation $s_{-i}$ to refer to the strategy profile
of all players excluding $i$. The Nash equilibrium is defined as follows:

Definition 1 A Nash Equilibrium (NE) is a strategy profile $s^* = (s_1^*, \ldots, s_n^*)$
such that no player can move profitably by changing her strategy, i.e. for each
player $i$, $u_i(s_i^*, s_{-i}^*) \geq u_i(s_i, s_{-i}^*), \forall s_i \in S_i$.

Cost: The cost of participation is local to the node and includes for example the
cost associated with the effort that a node spends in maintaining the route information⁶.
Other cost factors that depend on the volume of traffic (proportional
to the number of downstream nodes in the outcome $O_d$) are more relevant to
the forwarding plane and, as mentioned earlier in the assumptions, we ignore this
cost in the current model. Hence, we simply assume that every player $i$ incurs a
cost $c_i$ which is the cost of participating. We assume for the scope of this paper
that the local cost is constant with $c_i = c = 1$.

Utility: We experiment with a simple class of utility functions which rewards
a node linearly based on the number of sales that the node makes. This model
incentivizes distribution and potentially requires a large initial investment from
$d$. More precisely, define $N_i(s) = \{j \in V \setminus \{i\} \mid i \in R_j\}$ to be the set of nodes
that pick their best route to $d$ going through $i$ (nodes downstream of $i$) and let
$\delta_i(s) = |N_i(s)|$. Let the utility of a node $i$ from an outcome or strategy profile $s$
be:

$$u_i(s) = (r_{next(R_i)} - c_i) + \sum_{\{j \mid i = next(R_j)\}} (r_{next(R_i)} - r_{ij})(\delta_j(s) + 1) \qquad (1)$$

The first term $(r_{next(R_i)} - c_i)$ of (1) is incurred by every participating node and is
the one unit of reward from the upstream parent on the chosen best path minus
the local cost. Based on the fixed cost assumption, we often drop this first term
when comparing player payoffs from different strategies since the term is always
non-negative when $c = 1$. The second term of (1) (the summation) is incurred only
by distributors and is the total profit made by $i$ where $(r_{next(R_i)} - r_{ij})(\delta_j(s) + 1)$
is $i$'s profit from the sale to neighbor $j$ (which depends on $\delta_j$). A rational selfish
node will always try to maximize its utility by picking $s_i = (R_i, [r_{ij}]_j)$. There is
an inherent tradeoff between $(r_{next(R_i)} - r_{ij})$ and $\delta_j(s)$ s.t. $i = next(R_j)$ when



5 We abuse notation hereafter and we refer to the outcome with simply the strategy
profile $s$ where it should be clear from context that an outcome is defined by the tuple
$\langle s, r_d \rangle$. Notice that a strategy profile may be associated with an outcome if we model
$r_d$ as an action. We refrain from doing so to make it explicit that $r_d$ is not strategic.

6 A preliminary estimate of this cost is shown by Herrin [10] to be $0.04 per
route/router/year for a total cost of at least $6,200 per year for each advertised route 
assuming there are around 150,000 DFZ routers that need to be updated. 



trying to maximize the utility in Equation (1) in the face of competition as shall
become clear later. A higher promised reward $r_{ij}$ allows the node to compete
(and possibly increase $\delta_j$) but will cut the profit margin. Finally, we implicitly
assume that the destination node $d$ gets a constant marginal utility of $r_d$ for
each distinct player that maintains a route to $d$ (the marginal utility of being
discoverable by any seeker) and declares $r_d$ truthfully to its direct neighbors
(i.e. $r_d$ is not strategic).
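
As an illustration of Equation (1), the following Python sketch evaluates the utility of a node on a given outcome tree. The data layout is an assumption made only for this example: `next_hop` maps each player to its chosen upstream parent (the label `"d"` stands for the advertiser, `None` for a non-participating node), and `offer[(i, j)]` holds the reward $r_{ij}$.

```python
def downstream_counts(next_hop):
    """Return delta_i for every player: the number of nodes routing through i."""
    counts = {i: 0 for i in next_hop}
    for j in next_hop:
        hop = next_hop[j]
        # Walk upstream from j and credit every player on j's route to d.
        while hop != "d" and hop is not None:
            counts[hop] += 1
            hop = next_hop[hop]
    return counts


def utility(i, next_hop, offer, r_d, cost=1):
    """Evaluate Equation (1) for player i (hypothetical helper, cost c = 1)."""
    parent = next_hop[i]
    if parent is None:
        return 0  # a node that does not participate has zero utility
    promised = r_d if parent == "d" else offer[(parent, i)]
    delta = downstream_counts(next_hop)
    children = [j for j, p in next_hop.items() if p == i]
    profit = sum((promised - offer[(i, j)]) * (delta[j] + 1) for j in children)
    return (promised - cost) + profit


# Line d - 1 - 2 - 3 with r_d = 4, r_12 = 2 and r_23 = 1 (illustrative values).
next_hop = {1: "d", 2: 1, 3: 2}
offer = {(1, 2): 2, (2, 3): 1}
print([utility(i, next_hop, offer, r_d=4) for i in (1, 2, 3)])
```

On this example, player 1 earns $(4 - 1) + (4 - 2)(1 + 1) = 7$, player 2 earns $(2 - 1) + (2 - 1)(0 + 1) = 2$, and the leaf earns $1 - 1 = 0$.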

Convergence under HRP: Before proceeding with the game model, we first 
prove the following theorem which results in the Highest Reward Path (HRP) 
policy. 

Theorem 1 In order to maximize its utility, node $i$ must always pick the route
$R_i$ with the highest promised reward i.e. such that $r_{next(R_i)} \geq r_{next(R_i')}, \forall R_i' \in P(i)$.

The proof of Theorem 1 is given in the appendix. The theorem implies that a
player could perform her two actions sequentially, by first choosing the highest
reward route $R_i$, then deciding on the reward vector to export to its neighbors.
Thus, we shall represent player $i$'s strategy hereafter simply with the rewards
vector $[r_{ij}]_j$ and it should be clear that player $i$ will always pick the "best" route
to be the route with the highest promised reward. When the rewards are equal
however, we assume that a node breaks ties consistently.
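
The HRP selection rule with consistent tie-breaking can be sketched in a few lines of Python. The function name and the representation of a node's tie-breaking preference as a fixed ranking of its neighbors are assumptions made for illustration; the model only requires that ties be broken consistently.

```python
def pick_route(offers, preference):
    """Pick the upstream neighbor promising the highest reward.

    offers     -- dict: neighbor -> promised reward
    preference -- list ranking the neighbors; earlier entries win ties
    """
    if not offers:
        return None  # no advertised route is known
    rank = {neighbor: k for k, neighbor in enumerate(preference)}
    # Highest reward first; among equal rewards, the most preferred neighbor.
    return min(offers, key=lambda nb: (-offers[nb], rank[nb]))


print(pick_route({"a": 3, "b": 5, "c": 5}, preference=["c", "a", "b"]))
```

Here neighbors `b` and `c` tie on the highest reward and the fixed ranking selects `c`.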

The question we attempt to answer here is whether the BGP protocol dynamics
converge to a unique outcome tree under some strategy profile $s$. A standard
model for studying the convergence of BGP protocol dynamics was introduced
by Griffin et al. [8], and assumes BGP is an infinite round game in which a
scheduler entity decides on the schedule i.e. which players participate at each
round (this models the asynchronous operation of BGP). The authors devised the
"no dispute wheels" condition [9], which is the most general condition known to
guarantee convergence of possibly "conflicting" BGP policies to a unique stable
solution (tree). From Theorem 1, it may be easily shown that "no dispute wheels"
exist under HRP policy i.e. when the nodes choose the highest reward path breaking
ties consistently. This holds since any dispute wheel violates the assumption of
strictly decreasing rewards on the reward structure induced by the wheel. Hence,
the BGP outcome converges to a unique tree $T_d$ [5] under any strategy profile $s$.
This result allows us to focus on the existence of equilibria as it directly means
that the BGP protocol dynamics converge to a tree under any equilibrium
strategy profile.

2.1 The Static Multi-Stage Game with fixed schedule 

Again, for the scope of this paper, we restrict the analysis of equilibria to the 
simple line and ring graphs. In order to apply the correct solution concept, 
we fix the schedule of play (i.e. who plays when?) as we formalize shortly. We 
examine a static version of the full-information game in which each player plays 



once at a particular stage as determined by its proximity to d. The schedule 
is based on the inherent order of play in the model: recall that the advertiser 
d starts by advertising itself and promising a reward $r_d$; the game starts at
stage 1 where the direct neighbors of d, i.e. the nodes at distance 1 from d, 
observe and play simultaneously by picking their rewards while the rest of 
the nodes "do-nothing". At stage 2, nodes at distance 2 from d observe the stage 
1 strategies and then play simultaneously and so on. Stages in this multi-stage 
game with observed actions [7] have no temporal semantics. Rather, they identify
the network positions which have strategic significance. The closer a node is to 
the advertiser, the more power such a node has due to the strictly decreasing 
rewards assumption. The key concept here is that it is the information sets [7]
that matter rather than the time of play i.e. since all the nodes at distance 1 
from d observe $r_d$ before playing, all these nodes belong to the same information
set whether they play at the same time or at different time instants. We refer 
to a single play of the multi-stage game as the static game. We resort to the 
multi-stage model (the fixed schedule) on our simple graphs to eliminate the 
synchronization problems inherent in the BGP protocol and to focus instead on 
the existence of equilibria. By restricting the analysis to the fixed schedule, we 
do not miss any equilibria. This is due to the fact that the fixed schedule is 
only meant to replace the notion of "fair and infinite schedule" [5] with a more 
concrete order of play. The resulting game always converges in a single play for 
any strategy profile, and the outcome tree is necessarily one of shortest-paths 
(in terms of number of hops). The main limitation of this model however is that
it cannot deal with variable costs $c_i$ for which the outcome (HRP tree) might
not be a shortest-path tree.
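
The fixed schedule described above assigns each player to the stage equal to its hop distance from d. A minimal Python sketch of this assignment uses breadth-first search; the adjacency-list encoding and the label `"d"` for the advertiser are assumptions for illustration.

```python
from collections import deque


def stages(adjacency, d="d"):
    """Group players by hop distance from the advertiser d (stage of play)."""
    dist = {d: 0}
    queue = deque([d])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:  # first visit gives the shortest hop count
                dist[v] = dist[u] + 1
                queue.append(v)
    by_stage = {}
    for node, k in dist.items():
        if node != d:
            by_stage.setdefault(k, []).append(node)
    return by_stage


# Ring d - 1 - 3 - 5 - 4 - 2 - d (five players plus the advertiser).
ring = {"d": [1, 2], 1: ["d", 3], 2: ["d", 4], 3: [1, 5], 4: [2, 5], 5: [3, 4]}
print(stages(ring))
```

On this ring, players 1 and 2 play at stage 1, players 3 and 4 at stage 2, and player 5 at stage 3.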

Formally, and using notation from [7], each player $i$ plays only once at stage
$k > 0$ where $k$ is the distance from $i$ to $d$ in number of hops. At every other
stage, the player plays the "do nothing" action. The set of player actions at stage
$k$ is the stage-$k$ action profile, denoted by $a^k = (a_1^k, \ldots, a_n^k)$. Further, denote
by $h^{k+1} = (r_d, a^1, \ldots, a^k)$ the history at the end of stage $k$, which is simply the
initial reward $r_d$ concatenated with the sequence of actions at all previous stages.
We let $h^1 = (r_d)$. Finally, $h^{k+1} \in H^{k+1}$, the latter being the set of all possible
stage-$k$ histories. When the game has a finite number of stages, say $K + 1$, then
a terminal history $h^{K+1}$ is equivalent to an outcome of the game (which is a tree
$T_d$) and the set of all outcomes is $H^{K+1}$.

The pure strategy of player $i$ who plays at stage $k > 0$ is a function of the history
and is given by $s_i : H^k \to R^{m_i}$ where $m_i$ is the number of direct neighbors of
player $i$ that are at stage $k + 1$ (implicit here is that a player always picks the
highest reward route). Starting with $r_d$ (which is $h^1$), it is clear how the game
produces actions at every later stage based on the player strategies resulting
in a terminal action profile or outcome. Hence, given $r_d$, an outcome in $H^{K+1}$
may be associated with every strategy profile $s$, and so the definition of Nash
equilibrium (Definition 1) remains unchanged. Finally, it is worthwhile noting
that the "observed actions" requirement (where a player observes the full history 
before playing) is not necessary for our results in the static game as we shall see 



in the construction of the equilibrium strategies. Keeping this requirement in 
the model allows us to classify the play from some stage onward, contingent on 
a history being reached as a subgame in its own right as we describe next. 

Definition 2 JTjj A proper subgame of a full game is a restriction of the full 
game to a particular history. The subgame inherits the properties of the full 
game such as payoffs and strategies while simply restricting those to the history. 

In our game, each stage begins a new subgame which restricts the full game to
a particular history. For example, a history $h^k$ begins a subgame $G(h^k)$ such
that the histories in the subgame are restricted to $h^{k+1} = (h^k, a^k)$,
$h^{k+2} = (h^k, a^k, a^{k+1})$, and so on.

Definition 3 A strategy profile $s^* = (s_1^*, \ldots, s_n^*)$ is a subgame-perfect equilibrium
if it is a Nash equilibrium for every proper subgame of the full game.

Hereafter, the general notion of equilibrium we use is the Nash equilibrium and 
we shall make it clear when we generalize to subgame perfect equilibria. We 
are only interested in pure-strategy equilibria [7] and in studying the existence 
question as the incentive $r_d$ varies. We now proceed to study the equilibria on
special networks. 



3 Equilibria on the Line Graph, the Tree, and the Ring 
Graph 

In the general game model defined thus far, the tie-breaking preferences of the
players are a defining property of the game, and every outcome (including the
equilibrium) depends on the initial reward/utility $r_d$ of the advertiser. In the




Fig. 1. (a) Line graph: a player's index is the stage at which the player plays; d
advertises at stage 0; K = n; (b) Ring graph with even number of players: (i)
2-stage game, (ii) 3-stage game, and general (iii) K-stage game.



same spirit as [12] we inductively construct the equilibrium for the line graph
(simply referred to as the line hereafter) of Figure 1(a) given the utility function
of Equation (1). We present the result for the line which may be directly extended
to trees. Before proceeding with the construction, notice that for the line, $m_i = 1$
for all players except the leaf player since each of those players has a single
downstream neighbor. In addition, $\delta_i(s) = \delta_j(s) + 1, \forall i,j$ where $j$ is $i$'s child
($\delta_i = 0$ when $i$ is a leaf). We shall refer to both the player and the stage using
the same index since our intention should be clear from the context. For example,
the child of player $i$ is $i + 1$ and its parent is $i - 1$ where player $i$ is the player at
stage $i$. Additionally, we simply represent the history $h^{k+1} = (r_k)$ for $k > 0$ where
$r_k$ is the reward promised by player $k$ (player $k$'s action). The strategy of player
$k$ is therefore $s_k(h^k) = s_k(r_{k-1})$ which is a singleton (instead of a vector) since
$m_i = 1$ (for completeness, let $r_0 = r_d$). This is a perfect information game [7]
since a single player moves at each stage and has complete information about
the actions of all players at previous stages. Hence, backward induction may be
used to construct the subgame-perfect equilibrium.

We construct the equilibrium strategy $s^*$ inductively as follows: first, for all
players $i$, let $s_i^*(x) = 0$ when $x < c$ (where $c$ is assumed to be 1). Then assume
that $s_i^*(x)$ is defined for all $x < r$ and for all $i$. Obviously, with this information,
every player $i$ may compute $\delta_i(x, s_{-i}^*)$ for all $x < r$. This is simply due to
the fact that $\delta_i$ depends on the downstream players from $i$ who must play an
action or reward strictly less than $r$. Finally, for all players $i$ we let
$s_i^*(r) = \arg\max_x (r - x)\,\delta_i(x, s_{-i}^*)$ where $x < r$.
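
The inductive construction can be sketched in Python for a finite line of n players. This is a minimal sketch under stated assumptions, not a definitive implementation: a player joins whenever it is offered at least c = 1 (indifference is resolved toward participating), ties in the argmax go to the smaller offer, and the function names are hypothetical.

```python
from functools import lru_cache

COST = 1  # participation cost c, fixed to 1 as in the text


@lru_cache(maxsize=None)
def best_offer(k, r, n):
    """Reward that player k (1..n) passes to its child when promised r."""
    if k == n or r < COST:
        return 0  # the leaf has no child; a non-participant offers nothing
    # Maximize (r - x) * delta_k(x) over x < r; ties go to the smaller offer.
    return max(range(r), key=lambda x: ((r - x) * downstream(k, x, n), -x))


@lru_cache(maxsize=None)
def downstream(k, x, n):
    """delta_k(x): number of players below k that join when k offers x."""
    if k == n or x < COST:
        return 0
    return 1 + downstream(k + 1, best_offer(k + 1, x, n), n)


def reached(r_d, n):
    """Number of players that end up holding a route to d."""
    if n == 0 or r_d < COST:
        return 0
    return 1 + downstream(1, best_offer(1, r_d, n), n)


print(best_offer(1, 4, 3), reached(4, 3), reached(3, 3))
```

On a line of three players, player 1 offered $r_d = 4$ passes on 2, since $(4 - 2) \cdot 2 = 4$ beats $(4 - 1) \cdot 1 = 3$, and all three players are reached; with $r_d = 3$ only two players are reached under the assumed tie-breaking.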

Theorem 2 The strategy profile s* is a subgame-perfect equilibrium. 

Sketch of Proof The proof for the line is straightforward and follows from
backward induction by constructing the optimal strategies starting with the
last player (player $K$) first, then the next-to-last, and so on up to player 1.
The strategies are optimal for every history (by construction) and given the
utility function defined in Equation (1), no player can move profitably. Notice
that in general when $r_{next(R_i)} < c$, propagation of the reward will stop simply
because at equilibrium no player will want a negative utility and will prefer to
not participate instead (the case with the leaf player). □

The proof may be directly extended to the tree since each player in the tree has 
a single upstream parent as well and backward induction follows in the same 
way. On the tree, the strategies of the players that play simultaneously at each 
stage are also independent. 



3.1 Competition: the ring 

As opposed to the line, we present next a negative result for the ring graph
(simply referred to as the ring hereafter). In a ring, each player has a degree of
2 and $m_i = 1$ again for all players except the leaf player. We consider rings with
an even number of nodes due to the direct competition dynamics. Figure 1(b)
shows the 2-stage, the 3-stage, and general K-stage versions of the game. In the
multi-stage game, after observing $r_d$, players 1 and 2 play simultaneously at stage
1 promising rewards $r_1$ and $r_2$ respectively to their downstream children, and so
on. We shall refer to the players at stage $j$ using ids $2j - 1$ and $2j$, so that the
stage of a player $i$ may be computed from the id as $\lceil i/2 \rceil$.
For the rest of the discussion, we assume WLOG that the player at stage $K$
(with id $2K - 1$) breaks ties by picking the route through the left parent $2K - 3$.
For the 2-stage game in Figure 1(b)(i), it is easy to show that an equilibrium
always exists in which $s_1^*(r_d) = s_2^*(r_d) = (r_d - 1)$ when $r_d > 1$ and 0 otherwise.
This means that player 3 enjoys the benefits of perfect competition due to the
Bertrand-style competition [7] between players 1 and 2. The equilibrium in this
game is independent of player 3's preference for breaking ties. We now present
the following negative result.
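
The 2-stage equilibrium can be checked by brute force. The Python sketch below is an illustration under assumptions: payoffs drop the first term of Equation (1), player 3 joins only if offered at least 1, and the argument `tie` (a hypothetical parameter) names the parent that player 3 prefers when the two offers are equal.

```python
def payoff(i, r1, r2, r_d, tie=1):
    """Profit of player i in {1, 2} from selling the route to player 3."""
    if max(r1, r2) < 1:
        return 0  # player 3 does not participate
    winner = tie if r1 == r2 else (1 if r1 > r2 else 2)
    return r_d - (r1, r2)[i - 1] if winner == i else 0


def is_nash(r1, r2, r_d, tie=1):
    """True if neither player gains by unilaterally changing its offer."""
    actions = range(r_d)  # offers are integers strictly below r_d
    u1 = payoff(1, r1, r2, r_d, tie)
    u2 = payoff(2, r1, r2, r_d, tie)
    return (all(payoff(1, a, r2, r_d, tie) <= u1 for a in actions)
            and all(payoff(2, r1, a, r_d, tie) <= u2 for a in actions))


for r_d in range(2, 9):
    print(r_d, is_nash(r_d - 1, r_d - 1, r_d, tie=1),
          is_nash(r_d - 1, r_d - 1, r_d, tie=2))
```

For each tested $r_d$ the profile $(r_d - 1, r_d - 1)$ passes the check under either tie-breaking preference, in line with the statement that the equilibrium does not depend on how player 3 breaks ties.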



Claim 1 The 3-stage game induced on the ring (of Figure 1(b)(ii)) does not have
a subgame-perfect equilibrium. Particularly, there exists a class of subgames for
$h^1 = r_d > 5$ for which there is no Nash equilibrium.

Sketch of Proof The proof makes use of a counterexample. Using the backward
induction argument, notice first that the best strategy of players 3 and 4 is to
play a Bertrand-style competition as follows: after observing $a^1 = (r_1, r_2)$, player
3 plays $r_3 = 0$ when $r_1 = 1$, $r_3 = \min(r_1 - 1, r_2 - 1)$ when both $r_1 > 1$ and $r_2 > 1$,
and $r_3 = 1$ when $r_1 > 1$ and $r_2 = 1$. Player 4 plays symmetrically. Knowing that,
players 1 and 2 will choose their strategies simultaneously and no equilibria exist
for $r_d > 5$ due to oscillation of the best-response dynamics. This may be shown
by examining the strategic form game, in normal/matrix form, between players
1 and 2 (in which the utilities are expressed in terms of $r_d$). We briefly show
the subgame for $r_d = 6$ and we leave the elaborate proof as an exercise for the
interested reader. Figure 2 shows the payoff matrix of players 1 and 2 for playing
actions $r_1 \in \{2, 3\}$ (rows) and $r_2 \in \{1, 3\}$ (columns), respectively. The payoff
shown is taken to be $u_i = (r_d - r_i)\delta_i$, ignoring the first term of Equation (1).
The actions shown are the only remaining actions after applying iterated strict
dominance i.e. all other possible actions for the players are strictly dominated.
Clearly, no pure strategy Nash equilibria exist. The argument could be directly
extended to any $r_d > 5$ since player 2 will still have the incentive to oscillate.

□

Fig. 2. The payoff matrix of players 1 and 2 for the 3-stage game on the ring of
Figure 1(b)(ii) when $r_d = 6$.
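The oscillation argument can be checked by brute force. The sketch below is not the paper's Figure 2; it recomputes payoffs under assumptions that we read off the proof sketch: a child joins only for a reward of at least 1, a stage-2 player can promise at most one less than the reward it receives, the larger feasible promise wins player 5, and ties go to the left branch (player 3). The function names `delta`, `payoffs`, and `pure_nash` are hypothetical.

```python
def delta(own, rival, tie_to_me):
    """Downstream players captured by a stage-1 player promising `own`.

    Assumptions (ours): a zero reward attracts no child; the stage-2 child
    can promise at most own - 1; the larger feasible promise wins player 5.
    """
    if own < 1:
        return 0
    cap_own, cap_rival = own - 1, rival - 1
    wins = cap_own >= 1 and (
        cap_own > cap_rival or (tie_to_me and cap_own == cap_rival)
    )
    return 1 + int(wins)


def payoffs(r_d, r1, r2):
    """Utilities u_i = (r_d - r_i) * delta_i for players 1 and 2."""
    u1 = (r_d - r1) * delta(r1, r2, tie_to_me=True)   # left branch wins ties
    u2 = (r_d - r2) * delta(r2, r1, tie_to_me=False)
    return u1, u2


def pure_nash(r_d):
    """All pure-strategy Nash equilibria over actions 0 .. r_d - 1."""
    actions = range(r_d)
    equilibria = []
    for r1 in actions:
        for r2 in actions:
            u1, u2 = payoffs(r_d, r1, r2)
            best1 = all(payoffs(r_d, a, r2)[0] <= u1 for a in actions)
            best2 = all(payoffs(r_d, r1, b)[1] <= u2 for b in actions)
            if best1 and best2:
                equilibria.append((r1, r2))
    return equilibria


print(pure_nash(6))  # [] under the assumptions above
```

Under these assumptions the restricted game on $r_1 \in \{2, 3\}$, $r_2 \in \{1, 3\}$ cycles through its four cells, which is the oscillation the proof sketch describes, while $r_d = 5$ still admits the pure equilibrium $(r_1, r_2) = (2, 1)$.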

The value $r_d > 5$ signifies the breaking point of equilibrium or the reward at
which player 2, when maximizing her utility $(r_d - r_2)\delta_2$, will always oscillate
between competing for player 5 (by playing large $r_2$) or not (by playing small $r_2$).
Hence, under the linear utility given in Equation (1), an equilibrium does not
exist on the simple ring. This negative result for the game induced on the
3-stage ring may be directly extended to the general game for the K-stage ring
by observing that a class of subgames $G(h^{K-2})$ of the general K-stage game are
identical to the 3-stage game. While the full game does not have an equilibrium
for K > 2 stages, we shall show next that there always exists an equilibrium for
the special subgame $G(r_d^*)$ (for $h^1 = r_d^*$), where the reward $r_d^*$ is the minimum
incentive to guarantee that d's route is globally distributed at equilibrium. We
define and compute $r_d^*$ next before constructing the equilibrium.



3.2 Growth of Incentives, and a Special Subgame 

We next answer the following question: Find the minimum incentive $r_d^*$, as a
function of the depth of the network K (equivalently the number of stages in
the multi-stage game), such that there exists an equilibrium outcome for the
subgame $G(r_d^*)$ that is a spanning tree. We seek to compute the function f such
that $r_d^* = f(K)$. First, we present a result for the line, before extending it to the
ring. On the line, K is simply the number of players i.e. K = n.

Lemma 1. On the line graph, we have f(0) = 0, f(1) = 1, f(2) = 2, and
$\forall\, k > 2$

$$f(k) = (k - 1)\,f(k - 1) - (k - 2)\,f(k - 2) \tag{2}$$

The proof is presented in Appendix B. Notice that f(K) grows exponentially
with the depth K of the line network.⁷ By subtracting f(k - 1) from both sides
of the recurrence relation, it may be shown that

$$f(k) - f(k - 1) = (k - 2)! \tag{3}$$
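Subtracting f(k - 1) from both sides of the recurrence gives f(k) - f(k - 1) = (k - 2)(f(k - 1) - f(k - 2)), and since f(2) - f(1) = 1 the differences telescope into a factorial. The following minimal sketch tabulates f and checks Equation (3) numerically for small depths; the helper name `f` mirrors the paper's symbol, and the range of depths tested is an arbitrary choice of ours.

```python
from math import factorial


def f(k):
    """Minimum incentive on a line of depth k, from the recurrence of Lemma 1."""
    values = [0, 1, 2]  # f(0), f(1), f(2)
    for n in range(3, k + 1):
        values.append((n - 1) * values[n - 1] - (n - 2) * values[n - 2])
    return values[k]


# Equation (3): consecutive differences are factorials (checked for k >= 2).
for k in range(2, 9):
    assert f(k) - f(k - 1) == factorial(k - 2)

print([f(k) for k in range(9)])  # [0, 1, 2, 3, 5, 11, 35, 155, 875]
```

The printed values make the growth concrete: the required incentive is a running sum of factorials, so it quickly outpaces the depth of the line.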



We now revisit the K-stage game of Figure 1(b)(iii) on the ring and we focus
on a specific subgame which is the restriction of the full game to $h^1 = r_d^* = f(K)$,
and we denote this subgame by $G(r_d^*)$. Consider the following strategy profile $s^*$
for the subgame: players at stage j play $s^*_{2j-1}(h^j) = f(K - j)$ and $s^*_{2j}(h^j) =
f(K - j - 1)$, $\forall\, 1 \le j \le K - 1$, and let $s^*_{2K-1}(h^K) = 0$.

Theorem 3 The profile $s^*$ is a Nash equilibrium for the subgame $G(r_d^*)$ on the
K-stage ring, $\forall\, K > 2$.



7 On the other hand, on complete d-ary trees, it may be shown that the function
$f(k) = O(k) = O(\log_d n)$ for d > 2 since the number of players, and hence $\delta_i$, grows
exponentially with depth K. These growth results on the line graph and the tree seem
parallel to the result of Kleinberg and Raghavan [12] (and the elaboration in [1]) which
states that the reward required by the root player in order to find an answer to a
query with constant probability grows exponentially with the depth of the tree when
the branching factor of the tree is 1 < b < 2 i.e. when each player has an expected
number of offspring 1 < b < 2, while it grows logarithmically for b > 2.



The proof is presented in Appendix C. This result may be interpreted as follows:
if the advertiser were to play strategically assuming she has a marginal utility
of at least $r_d^*$ and is aiming for a spanning tree (global discoverability), then
$r_d^* = f(K)$ will be her Nash strategy in the game induced on the K-stage ring,
$\forall\, K > 2$ (given $s^*$).
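To make the profile $s^*$ concrete, the sketch below tabulates the promised rewards on a K-stage ring and the resulting utilities $u_i = (r_{parent} - r_i)\delta_i$. It assumes the spanning-tree outcome used in the proof, in which an odd player at stage j has $\delta = K - j$; the count $\delta = K - j - 1$ for even players is our reading of that outcome (the paper states it explicitly only for player 2). The names `ring_profile` and `utilities` are hypothetical.

```python
from math import factorial


def f(k):
    """Minimum incentive on a line of depth k (recurrence of Lemma 1)."""
    values = [0, 1, 2]
    for n in range(3, k + 1):
        values.append((n - 1) * values[n - 1] - (n - 2) * values[n - 2])
    return values[k]


def ring_profile(K):
    """Rewards promised under s* on the K-stage ring, keyed by player id."""
    rewards = {}
    for j in range(1, K):
        rewards[2 * j - 1] = f(K - j)      # left-branch player at stage j
        rewards[2 * j] = f(K - j - 1)      # right-branch player at stage j
    rewards[2 * K - 1] = 0                 # last player has no children
    return rewards


def utilities(K):
    """Utilities (second term only) of the players at stages 1 .. K - 1."""
    r = ring_profile(K)
    r_d = f(K)
    u = {}
    for j in range(1, K):
        parent_odd = r_d if j == 1 else r[2 * j - 3]
        parent_even = r_d if j == 1 else r[2 * j - 2]
        u[2 * j - 1] = (parent_odd - r[2 * j - 1]) * (K - j)
        u[2 * j] = (parent_even - r[2 * j]) * (K - j - 1)  # assumed delta
    return u


print(ring_profile(4))
print(utilities(4))
```

For each stage j the odd player's utility evaluates to (K - j)!, and player 2 obtains (K - 1)!, matching the values derived in the proof of Theorem 3.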

We have shown in Lemma 1 that the minimum incentive $r_d^*$ on the line
(such that there exists an equilibrium spanning tree for the subgame $G(r_d^*)$) as a
function of depth K is $r_d^* = f(K)$. We now extend the result to the ring denoting
by $f_r(K)$ the growth function for the ring in order to distinguish it from that of
the line, f(K).

Corollary 1. On the ring graph, we have $f_r(k) = f(k)$ as given by Lemma 1.

Sketch of Proof We have shown in Theorem 3 that $s^*$ is an equilibrium
for the subgame $G(r_d^*)$ for $r_d^* = f(K)$ and that the equilibrium is a spanning
tree. What remains to show is that f(K) is the minimum incentive required.
This follows by isolating the left branch of the ring, which is a line graph that
consists of player d and all the players with odd identifiers, and using the
same argument of Lemma 1 on this branch: an $r_d < f(K)$ allows player 1 to
move profitably by playing an $r_1 < f(K - 1)$ which violates the spanning tree
requirement (by definition of f). □

4 Discussion 

The Nash equilibria constructed in this paper are not unique. It is addition- 
ally well known that in a multi-stage game setting, the Nash equilibrium notion 
might not be "credible" as it could present suboptimal responses to histories 
that would not occur under the equilibrium profile [7], rendering subgame per- 
fect equilibria more suitable in such circumstances. All the Nash equilibria that 
we have constructed are credible and are consistent with backward induction for 
the respective histories of the subgames studied. A distinct aspect of our game is 
that a player i at stage k may not carry an empty threat to an upstream parent
at stage k - 1, since player i's actions are constrained by the parent's action
as dictated by the network structure and the decreasing rewards assumption.
In this paper, we have studied the equilibria existence question only. Other
important questions include quantifying how hard it is to find the equilibria, and
devising mechanisms to get to them. These questions, in addition to extending
the results to general network structures and relaxing the fixed cost assumption, 
are part of our ongoing work. 

While the distributed incentive model has advantages over centralized
mechanisms that rely on a "designer", the model might suffer from exponential growth
of rewards which could potentially make it infeasible for sparse and large
diameter networks. Quantifying the suitability of this model to general network
structures and to the Internet connectivity graph specifically requires further
investigation. Interestingly, while it is a complex network, the Internet's
connectivity graph is a small-world network i.e. the average distance between any two
nodes on the Internet is small [2].

Finally, we have only considered the setting in which d's marginal utility is con- 
stant which seems intuitive in a BGP setting where global reachability is the 
goal, since every node in the DFZ must keep state information about d or else 
the latter will be unreachable from some parts of the network. Other economic 
models that assume the network is a market with elastic demand (based on d's 
utility) and that determine prices based on demand and supply, are interesting 
to investigate. They may even be more intuitive in settings where it makes sense 
to advertise (or sell) a piece of information to a local neighborhood. 

A Proof of Theorem 1

Proof. The case for $|B(i)| = 1$ is trivial. The case for $|B(i)| = 2$ is trivial as
well since i will not be able to make a sale to the higher reward neighbor by
picking the lower reward offer. Assume that node i has more than 2 neighbors
and that any two neighbors, say k, l, advertise routes $R_k, R_l \in P(i)$ s.t. $k =
next(R_k)$, $l = next(R_l)$ and $r_k < r_l$, and assume that i's utility for choosing
route $R_k$ over $R_l$ either increases or remains the same i.e. $u_i^{R_k} \ge u_i^{R_l}$. We will
show by contradiction that neither of these two scenarios could happen.
scenario 1: $u_i^{R_k} > u_i^{R_l}$. From Equation (1), it must be the case that either (case
1) node i was able to make at least one more sale to some neighbor j who would
otherwise not buy, or (case 2) some neighbor j who picks $(j,i)R_l$ can strictly
increase her $\delta_j(s)$ when i chooses the lower reward path $R_k$. For case 1, and
assuming that is the same when i chooses either route, it is simple to show 
that we arrive at a contradiction in the case when $j \in \{k, l\}$ (mainly due to the
strictly decreasing reward assumption i.e. $r_i < r_{next(\cdot)}$); and in the case when
$j \notin \{k, l\}$, it must be the case that j's utility increases with i's route choice i.e.
$u_j^{(j,i)R_k} > u_j^{(j,i)R_l}$. This contradicts Equation (1) since w.r.t. j, both routes
have the same next hop node i. The same analogy holds for case 2.
scenario 2: $u_i^{R_k} = u_i^{R_l}$. Using the same analogy of scenario 1, there must exist
at least one neighbor j of i that would buy i's offer only when the latter picks
$R_k$, or otherwise node i will be able to strictly increase its utility by picking $R_l$,
pocketing more profit.

B Proof of Lemma 1

Proof. First, f(0) = 0, f(1) = 1 and f(2) = 2 are trivially true given the utility
function of Equation (1). The proof uses induction on the depth of the network.
First, for the base case k = 3, in the 3-stage line the Nash equilibrium is for
player 1, the player at distance 1 from d, to play $r_1^* = 2$ and for player 2 to
play $r_2^* = 1$ (in every NE, $\delta_i(1) = 0$, $\forall i$). Given $r_d^* = f(3) = 3$, the utility of
player 1 is $u_1 = (3 - 2) \cdot 2 \ge (3 - r_1)\delta_1$, $\forall\, r_1 < 3$. Similarly player 2 may not move
profitably from playing $r_2^* = 1$.

Assume $f(x) = (x - 1)f(x - 1) - (x - 2)f(x - 2)$ holds $\forall\, x < k$. We construct
the k-stage game from the (k - 1)-stage game by adding a node/player between
node d and node 1 in the (k - 1)-stage game. Notice the player 2 in the k-stage
game used to be player 1 in the (k - 1)-stage game. By definition of f, in the
k-stage game, when player 1 plays $r_1 = f(k - 1)$ then $\delta_1 = (k - 1)$ and no player
i, $2 \le i \le k$, may deviate profitably from playing $r_i = f(k - i)$. Here $r_1 = f(k - 1)$
is the minimum reward to get a $\delta_1 = (k - 1)$. In general, it holds by construction
of f that there are k possible outcomes for player 1, corresponding to the values
$\delta_1 = 0, 1, \ldots, k - 1$. For each of these outcomes, we have an action for player 1,
$r_1 = f(x)$, which results in the outcome tree corresponding to $\delta_1 = x$, $\forall\, x < k$
and such that no player besides player 1 may deviate profitably contingent on
player 1 playing $r_1 = f(x)$ (in this outcome player i plays $f(x - i + 1)$, $\forall\, 2 \le i \le n$).
In order for $\delta_1 = k - 1$ to be the equilibrium outcome, it must be the case that
$r_1 = f(k - 1)$ maximizes player 1's utility given $r_d$ (and hence no player including
player 1 may deviate profitably) i.e. it must be that $\forall\, 2 \le j \le k$

$$(r_d - f(k - 1))(k - 1) \ge (r_d - f(k - j))(k - j)$$

This condition is equivalent to:

$$(r_d - f(k - 1))(k - 1) \ge (r_d - f(k - 2))(k - 2) \tag{4}$$

since $(r_d - f(k - 2))(k - 2) \ge (r_d - f(k - j))(k - j)$, $\forall\, 3 \le j \le k$ and for
$r_d > f(k - 1)$. Equation (4) implies that $r_d \ge (k - 1)f(k - 1) - (k - 2)f(k - 2)$.
The minimum such incentive is:

$$r_d^* = f(k) = (k - 1)\,f(k - 1) - (k - 2)\,f(k - 2) \tag{5}$$

which is greater than f(k - 1), concluding the proof. □
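The inductive step can be sanity-checked numerically. The sketch below searches for the smallest integer $r_d$ that satisfies the no-deviation condition for player 1 against every alternative outcome $j = 2, \ldots, k$, and compares it with f(k). Starting the search just above f(k - 1) reflects the strictly decreasing rewards assumption; the helper name `min_incentive` and the restriction to integer rewards are assumptions of ours.

```python
def f(k):
    """Minimum incentive on a line of depth k (recurrence of Lemma 1)."""
    values = [0, 1, 2]
    for n in range(3, k + 1):
        values.append((n - 1) * values[n - 1] - (n - 2) * values[n - 2])
    return values[k]


def min_incentive(k):
    """Smallest integer r_d for which r_1 = f(k - 1) is a best reply of player 1."""
    r_d = f(k - 1) + 1  # rewards strictly decrease downstream, so r_d > r_1
    while not all(
        (r_d - f(k - 1)) * (k - 1) >= (r_d - f(k - j)) * (k - j)
        for j in range(2, k + 1)
    ):
        r_d += 1
    return r_d


# The brute-force minimum coincides with the recurrence for small depths.
for k in range(3, 9):
    assert min_incentive(k) == f(k)
```

In every depth tested the binding constraint is the one for j = 2, which is why the condition collapses to Equation (4).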

C Proof of Theorem 3

Proof. Notice first that the complete history $h^{K+1}$ which corresponds to $r_d^*$ and
$s^*$ is an outcome that is a spanning tree (each player picks the best route through
the upstream parent while the last player 2K - 1 prefers the left parent who is
promising a higher reward). We will show that no player i can deviate profitably from
playing $s_i^*$ given $s_{-i}^*$ by considering the players at each stage j, $\forall\, 2 \le j \le K - 1$
first and then we extend the reasoning to the players at stage 1. For the players
at stage j we show that player 2j - 1 may not deviate profitably from playing
$s^*_{2j-1}(h^j) = r_{2j-1} = f(K - j)$ given the strategies of the rest of the players
(particularly given $s^*_{2j}(h^j) = r_{2j} = f(K - j - 1)$), and the same for player 2j.
Given that $r_{2j} < r_{2j-1}$ (i.e. player 2j not competing with player 2j - 1), then
by construction of the function f, there exists an outcome on the ring such that
$\delta_{2j-1} = K - j$ when $r_{2j-1} = f(K - j)$ and $r_{2j} < r_{2j-1}$ (this holds at each stage
$2 \le j \le K - 1$ given the tie-breaking preference of player 2K - 1). The utility
then to player 2j - 1 of playing $r_{2j-1} = f(K - j)$ is:



\begin{align}
u_{2j-1} &= (f(K - j + 1) - f(K - j))(K - j) \tag{6} \\
&= (f(K - j + 1) - f(K - j - 1))(K - j - 1) \tag{7} \\
&= (K - j)! \tag{8}
\end{align}



where the second equality holds by definition of function f (Equation (2)) and
the third equality holds because $(f(K) - f(K - 2))(K - 2) = (f(K) - f(K - 1) +
f(K - 1) - f(K - 2))(K - 2) = ((K - 2)! + (K - 3)!)(K - 2) = (K - 1)!$. Given the
strategies of the rest of the players, player 2j - 1 may not deviate profitably i.e.
$u_{2j-1}(f(K - j), s^*_{-(2j-1)}) \ge u_{2j-1}(r', s^*_{-(2j-1)})$, $\forall\, r' \ne f(K - j)$. This is simply
because playing an $r' > f(K - j)$ will strictly decrease $u_{2j-1}$ since $\delta_{2j-1}$ is already
maximized ($\delta_{2j-1} = K - j$ in this case), while playing $r' < f(K - j)$ can at best
yield player 2j - 1 the same utility when $r' = f(K - j - 1)$ (Equation (7)). The
same reasoning holds for player 2j who may not deviate profitably by playing
$r'' \ne f(K - j - 1)$. Specifically, any $r'' < f(K - j - 1)$ can at best yield player 2j
the same utility when $r'' = f(K - j - 2)$, and in order to compete with player
2j - 1 (and possibly increase $\delta_{2j}$) player 2j must play $r'' > r_{2j-1} = f(K - j)$
which violates the decreasing rewards assumption. Hence neither player at stage
j may deviate profitably for all $2 \le j \le K - 1$. It remains to show that players
at stage 1 may not deviate profitably. First, player 1 may not deviate profitably
using the same argument we used for player 2j - 1 where j = 1. The utility to
player 1 is $u_1(f(K - 1), s^*_{-1}) = (K - 1)!$. On the other hand, player 2 gets the
same utility as player 1 where $u_2(f(K - 2), s^*_{-2}) = (f(K) - f(K - 2))(K - 2) =
(K - 1)!$. In the same way, player 2 may not deviate profitably since playing
any $r_2' \ne f(K - 2)$ may not increase $u_2$ given $s^*_{-2}$. More clearly, in order for
player 2 to compete with player 1 and possibly increase $\delta_2$ from K - 2 to K - 1,
player 2 must play an $r_2' > f(K - 1)$ which in the best case yields a utility
$u_2(r_2', s^*_{-2}) = (f(K) - r_2')(K - 1) < (K - 1)!$. Hence, neither player 1 nor player
2 may deviate profitably given the strategies of the other players. Finally, the
case for player 2K - 1 is trivial. This concludes the proof. □



